-
Notifications
You must be signed in to change notification settings - Fork 475
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] *: Wire up compaction picking with L0Sublevels, enable flush splits #642
[WIP] *: Wire up compaction picking with L0Sublevels, enable flush splits #642
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
existing compaction picker
This isn't quite the existing compaction picker since it includes a number of heuristic changes from #563 that complement but aren't implied by sub-levels, including how other levels are picked.
I did run without these changes, but I don't have a clean record of how badly those runs went since after adding the heuristics here I made further changes like splitting flushes, and some algorithmic improvements to L0SubLevels (like using knowledge of what files in Lbase were already compacting). It may be worth doing a run with L0SubLevels and a trivial integration with compactionPickerByScore
and see what happens, and then use that to justify the heuristics here -- this is the most debatable part of all the changes and there are multiple magic constants.
Ideally we should try with a different stressful workload in addition to the TPCC import -- that may help with tweaking the heuristics, including changing the magic constants. I'm ok with doing that in future PRs, but I am not the one you need to convince :)
Reviewable status: 0 of 20 files reviewed, all discussions resolved (waiting on @petermattis)
This change adds methods to L0SubLevels to help pick, score, and generate L0 -> LBase, and L0 -> L0 compactions, based on information captured in the data structure about L0 sublevels. These functions will be called from in compaction.go and compaction_picker.go in a future change. Also adds associated datadriven unit tests, and a benchmark. Covers a large part of cockroachdb#563. Thanks Sumeer for his work, most of this was written by him.
This change connects the compaction picking methods in L0SubLevels with the existing compaction picker, replacing existing logic for picking L0 compactions. It also enables flush splits, and adds a pebble Option (FlushSplitBytes) to set the threshold for number of bytes that would need to be observed in L0 files since the last split key before an interval boundary is added as a flush split key.
a6beafb
to
ab33764
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 20 files reviewed, 2 unresolved discussions (waiting on @itsbilal and @petermattis)
compaction_picker.go, line 324 at r2 (raw file):
func (p *compactionPickerByScore) initL0Score(inProgressCompactions []compactionInfo) { if p.vers.L0SubLevels != nil {
Will this ever be nil? This was a temporary integration hack for my experiment -- the "else" code should be deletable.
compaction_picker.go, line 509 at r2 (raw file):
// compaction -- see the comment in the if-condition below. // // - We segment compactions into 3 priority levels:
@petermattis since you were asking today. There are some long comments in this file, but I should have written more justification for these priority levels.
The issue is that we have
- two measures for scoring compactions: the score and the currentByteRatios
- the score for L0 is not really comparable with the other scores since it is computed differently
- I did not increase the max number of concurrent compactions since I considered that cheating wrt experimental comparison, so the import experiment continued to run with 3. So we did not have enough compaction capacity to keep the "nice" LSM shape. So the question was whether we can keep an "ok" LSM shape (which is the long comment preceding this). In the real world, given that we have the potential to do many concurrent compactions, we could consider increasing max concurrent compactions if one is provisioned appropriately. I think we will also need to bring back some of the rate throttling code.
Arranging these into a priority total order is non obvious. So the heuristic first tries to categorize so it can side-step constructing a total ordering. One could potentially argue for 2 categories, high and low. During my experiments I was struggling with how to deal with very high score with just 2 categories. 3 categories made it simpler for me:
- highest was really meant for something that has fallen behind too much -- we don't really care which level this is. Just compact it! Things ended up here occasionally, and quickly get out. Note that the ordering within highest is by decreasing score. Most of the time there was nothing in highest.
- high is where we can start playing more interesting games -- this is where most of the levels are: we reorder by preferring higher levels first. This is what makes L0 => Lbase happen a lot, and then Lbase => Lbase+1
- low is the boring category: ordered again by decreasing score. If we get to these we don't really care about
currentByteRatios
-- we have enough resources to maintain a low enough score for all levels.
@sumeerbhola I've answered the questions you asked in #670. This PR was mostly a reference point for any reviewers of past PRs to be able to tie together pieces, but I didn't expect it to be reviewed directly. As I mentioned in #670, my experimentation so far shows that the prioritization done here is less impactful than other changes like flush splits and preIngestDelay changes. The |
@itsbilal This can be closed, right? |
Yes, this can be closed. Thanks for reminding! |
This change connects the compaction picking methods in L0SubLevels
with the existing compaction picker, replacing existing logic for
picking L0 compactions.
It also enables flush splits, and adds a pebble Option (FlushSplitBytes)
to set the threshold for number of bytes that would need to be observed
in L0 files since the last split key before an interval boundary is
added as a flush split key.
Very much a work-in-progress; first commit is #615.